Abstract. An and/or tree is a binary plane tree, with internal nodes labelled by logical connectives, and with leaves labelled by literals chosen in a fixed set of $k$ variables and their negations. Pick uniformly at random such a Boolean tree with $n$ leaves, and consider the Boolean function it represents. Finally, let the size $n$ of the trees tend to infinity. This process defines a random distribution on Boolean functions of $k$ variables, named the Catalan tree distribution. It has long been studied in the literature; however, quantitative results were obtained only by taking, in a last step, an infinite limit for $k$.

In the present paper, we investigate the global model in which the number of variables $k_n$ is a function of $n$. We describe the whole range of the probability distributions depending on the function $k_n$, as soon as it tends jointly with $n$ to infinity. In this context, we exhibit a threshold $M_n$, equivalent to $n/\ln n$, such that, when $k_n$ becomes larger, the probability distribution becomes stable.

To study this model, we mainly use analytic combinatorics and we extend Kozik's pattern theory, first developed for the Catalan tree model.

Keywords: Random Boolean expressions; Boolean formulas; Boolean functions; Probability distribution; Analytic combinatorics; Complexity.

1 Introduction 

Pick uniformly at random a large Boolean expression and focus on the Boolean function it represents. How random is this Boolean function? For example, what is the probability of getting a satisfiable function, or any given function? Former results based on specific Boolean expressions (the variables and the connectives used to build the expressions are fixed and finite sets) highlight a relation between the complexity of a function and its probability.

The first approach, by Lefmann and Savicky [1], consists in fixing a finite set of variables, allowing the two logical connectives and and or, and choosing uniformly at random a Boolean expression of size $n$ in this logical system. Lefmann and Savicky first proved the existence of a limiting probability distribution on Boolean functions when the size of the random Boolean expression tends to infinity. Since the seminal paper by Chauvin et al. [2], almost all quantitative studies of such Boolean distributions are deeply related to analytic combinatorics: a survey by Gardy [3] provides a wide range of models with various numerical results. Later, Kozik [4] proved a strong relation between the limiting probability of a given function and its complexity (i.e. the minimal size of an expression representing the function). His approach proceeds in two steps: (1) first let the size of the Boolean expressions under consideration tend to infinity, and then (2) let the number of variables used to label the expressions tend to infinity. His powerful machinery, the pattern theory, easily classifies and counts large expressions according to structural constraints. This theory, already adapted to other logical systems [5], will also be extended in this paper.

In order to swap the two ordered limits (1) and (2) (on the size of the expressions and then on the number of variables), Genitrini et al. [6,7] presented another model of random expressions built on an infinite set of variables: a notion of equivalence classes of expressions is needed and introduced by the authors. Though some interesting similarities between this new model and the finite one have been observed, no direct link has been explained.

This paper presents a more general model, which unifies both previous models in a single approach. By using a slightly different equivalence relationship on Boolean expressions, we manage to let both the number of variables and the size of the formulas tend jointly to infinity. We let the number of variables be a function of the size of the expressions and exhibit a threshold: as soon as the number of variables is large enough compared to the size of the expressions, the general behaviour of the induced probability on the set of Boolean functions no longer changes when more variables are added.

We focus on the logical context of and/or connectives in order to adapt the pattern theory of Kozik and because of the richness of this logical system (normal forms, functional completeness). However, the implicational logical system (e.g. [8,7]) could also be studied in this new context, and we strongly believe the general behaviour to be identical.

The paper is organized as follows. Section 2 introduces our unified model based on an equivalence relation on Boolean expressions. Then, Section 3 states our two main results: (1) the link between the probability of a class of functions and the complexity of the functions taken into account; (2) the behaviour of the probability related to the dynamic between the number of variables and the size of the expressions. Section 4 is devoted to the technical core of the paper. Finally, Section 5 applies our approach to and/or trees and proves the main results.

Almost all proofs are given in the appendices.

2 Probability distributions on equivalence classes of 
Boolean functions 

2.1 Contextual definitions 

A Boolean function is a function from $\{0,1\}^{\mathbb{N}}$ into $\{0,1\}$. The set of Boolean functions is denoted by $\mathcal{F}$. In the following, $(x_1, x_2, \dots)$ will be an element of $\{0,1\}^{\mathbb{N}}$. A variable $x_i$ can be negated: $\bar{x}_i = 1 - x_i$, and we call literal a variable or its negation. The two connectives taken into account, and and or, are respectively denoted by $\land$ and $\lor$.

An and/or Boolean expression is seen as an and/or tree, i.e. a binary plane tree with leaves labelled by a literal and with internal nodes labelled by a connective. Each and/or tree computes (or represents) a Boolean function. Obviously, infinitely many and/or trees compute the same Boolean function. The size of an and/or tree is its number of leaves: remark that, for all $n \ge 1$, there is an infinite number of and/or trees of size $n$.

The complexity of a Boolean function $f$, denoted by $L(f)$, is defined as the size of its minimal trees, i.e. the smallest trees computing $f$. Although a Boolean function is defined on an infinite set of variables, it may really depend only on a finite subset of essential variables: given a Boolean function $f$, we say that the variable $x$ is essential for $f$ if and only if $f|_{x \leftarrow 0} \neq f|_{x \leftarrow 1}$ (where $f|_{x \leftarrow \alpha}$ is the restriction of $f$ to the subspace of $\{0,1\}^{\mathbb{N}}$ where $x = \alpha$). We denote by $E(f)$ the number of essential variables of $f$. Remark that the complexity and the number of essential variables of a Boolean function are only related by the following inequality: $E(f) \le L(f)$.
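
The definition of essential variables can be illustrated on a small tree. The following is a minimal sketch, assuming a hypothetical tuple encoding of and/or trees (a leaf is `("x", i, positive)`, an internal node is `(op, left, right)`); the encoding and all function names are illustrative choices, not notation from the paper. Only variables that appear in the tree can be essential, so an exhaustive check over those variables suffices.

```python
from itertools import product

# Hypothetical encoding: leaf = ("x", index, positive), node = (op, left, right)
# with op in {"and", "or"}.

def evaluate(tree, assignment):
    """Evaluate an and/or tree under a dict {variable index: 0 or 1}."""
    if tree[0] == "x":
        _, i, positive = tree
        return assignment[i] if positive else 1 - assignment[i]
    op, left, right = tree
    a, b = evaluate(left, assignment), evaluate(right, assignment)
    return a & b if op == "and" else a | b

def variables(tree):
    """Set of variable indices labelling the leaves."""
    if tree[0] == "x":
        return {tree[1]}
    return variables(tree[1]) | variables(tree[2])

def size(tree):
    """Size of a tree = number of leaves."""
    return 1 if tree[0] == "x" else size(tree[1]) + size(tree[2])

def essential_variables(tree):
    """Variables x such that f restricted to x=0 differs from f restricted to x=1."""
    vs = sorted(variables(tree))
    essential = set()
    for x in vs:
        others = [v for v in vs if v != x]
        for bits in product((0, 1), repeat=len(others)):
            env = dict(zip(others, bits))
            if evaluate(tree, {**env, x: 0}) != evaluate(tree, {**env, x: 1}):
                essential.add(x)
                break
    return essential

# (x1 or not x1) and x2 computes the projection on x2: x1 is not essential.
example = ("and", ("or", ("x", 1, True), ("x", 1, False)), ("x", 2, True))
```

Here the tree has size 3 while the function it computes has a single essential variable, which is consistent with $E(f) \le L(f)$ and with the fact that a tree may be much larger than a minimal tree of its function.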



2.2 Equivalence relations 

The tools of analytic combinatorics (cf. [9]) are based on the notion of combinatorial classes. A combinatorial class is a denumerable (or finite) set of objects on which a size notion is defined such that each object has a non-negative size and the set of objects of any given size is finite. Thus our class of and/or trees is not a combinatorial class. To use the tools of analytic combinatorics, we define an equivalence relation on Boolean trees.

The following equivalence relation is distinct from the one of [6,7], because their logical context does not allow negated variables. In the rest of the paper, we define a tree-structure to be an and/or tree in which leaf labels have been removed (but internal nodes remain labelled).

Definition 1. Let $A$ and $B$ be two and/or trees. Trees $A$ and $B$ are equivalent if (1) their tree-structures are identical, (2) two leaves are labelled by the same variable in $A$ if and only if they are labelled by the same variable in $B$, and (3) two leaves are labelled by the same literal in $A$ if and only if they are labelled by the same literal in $B$.

This equivalence relationship on Boolean trees straightforwardly induces an equivalence relationship on Boolean functions.

For example, both functions $(x_i)_{i \ge 1} \mapsto x_{2013}$ and $(x_i)_{i \ge 1} \mapsto x_1$ are equivalent. An important remark is that all functions of an equivalence class have the same complexity and the same number of essential variables. In the following, we will denote by $\langle f \rangle$ the equivalence class of the Boolean function $f$.



2.3 Probability distribution 

In the following, $k_n$ is the maximum number of different variables that can appear as labels of an and/or tree of size $n$. We assume that the sequence $k_n$ is increasing and tends to infinity as $n$ tends to infinity.

Definition 2. We denote by $T_n$ the number of equivalence classes of trees of size $n$ in which at most $k_n$ different variables appear as leaf labels. We define the ordinary generating function $T(z)$ as $T(z) = \sum_n T_n z^n$.

Proposition 1. The number of classes of trees of size $n$ satisfies:

$$T_n = C_n \cdot \sum_{p=1}^{k_n} {n \brace p}\, 2^{2n-1-p},$$

where $C_n$ is the number of non-labelled binary trees of size $n$ and ${n \brace p}$ is the Stirling number of the second kind.

Proof. Once the structure of the binary tree is chosen (factor $2^{n-1} C_n$), we partition the set of leaves into $p$ parts such that two leaves that belong to the same part are labelled by the same variable: it gives the contribution ${n \brace p}$. Then, we choose to label each leaf by a positive or negative literal (contribution $2^n$). The equivalence relationship states that a tree and the one obtained by replacing the positive literals corresponding to a fixed variable by its negative literal (and conversely) are equivalent. Thus, for each class we double-count the number of trees (correction $2^{-p}$). □
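
The counting argument of this proof can be checked by brute force on small sizes. The sketch below is illustrative only: the function names are hypothetical, and the canonical form (variables renamed by order of first appearance, polarities normalised so that the first occurrence of each variable is positive) is one possible way, assumed here, to pick a representative of each equivalence class of leaf labellings.

```python
from functools import lru_cache
from itertools import product
from math import comb

@lru_cache(maxsize=None)
def stirling2(n, p):
    """Stirling number of the second kind, by the usual recurrence."""
    if n == p:
        return 1
    if p == 0 or p > n:
        return 0
    return p * stirling2(n - 1, p) + stirling2(n - 1, p - 1)

def shifted_catalan(n):
    """C_n = (n-1)th Catalan number = number of binary trees with n leaves."""
    return comb(2 * n - 2, n - 1) // n

def classes_formula(n, k):
    """Number of classes of trees of size n with at most k variables (formula)."""
    return shifted_catalan(n) * sum(
        stirling2(n, p) * 2 ** (2 * n - 1 - p) for p in range(1, min(n, k) + 1))

def leaf_labelling_classes(n, k):
    """Brute force: classes of labellings of n ordered leaves by literals."""
    seen = set()
    for labelling in product(product(range(k), (0, 1)), repeat=n):
        rename, first_sign, canon = {}, {}, []
        for var, sign in labelling:
            if var not in rename:
                rename[var] = len(rename)   # rename by order of first appearance
                first_sign[var] = sign      # first occurrence becomes positive
            canon.append((rename[var], sign ^ first_sign[var]))
        seen.add(tuple(canon))
    return len(seen)

def classes_bruteforce(n, k):
    """Tree-structures (2^(n-1) C_n of them) times classes of leaf labellings."""
    return 2 ** (n - 1) * shifted_catalan(n) * leaf_labelling_classes(n, k)
```

Comparing `classes_formula(n, k)` with `classes_bruteforce(n, k)` for small `n` and `k` exercises each factor of the proof separately: the tree-structure factor, the set partition of the leaves, and the polarity correction.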

Given a set $S$ of equivalence classes of trees and $S_n$ the number of elements of $S$ of size $n$, we define the ratio of $S$ by $\mu_n(S) = \frac{S_n}{T_n}$. For a given Boolean function $f$, we denote by $T_n\langle f \rangle$ the number of equivalence classes of trees of size $n$ that compute a function of $\langle f \rangle$, and we define the probability of $\langle f \rangle$ as

$$\mathbb{P}_n\langle f \rangle = \frac{T_n\langle f \rangle}{T_n}.$$

One goal of this paper consists in studying the behaviour of the probabilities $(\mathbb{P}_n\langle f \rangle)_{f \in \mathcal{F}}$ when the size $n$ of the trees tends to infinity.

3 Results 

We state here our main result: the behaviour of $\mathbb{P}_n\langle f \rangle$ for all fixed functions $f \in \mathcal{F}$ in the framework of and/or trees. Saying that $f$ is fixed means that its complexity is independent from $n$.

The main idea of this part is that a typical tree computing a Boolean func- 
tion f is a minimal tree of f in which has been plugged a large tree, that does 
not distort the function computed by the minimal tree. 



4 In Proposition 1, $C_n$ is the $(n-1)$th Catalan number (see e.g. [9, p. 6-7]).
Definition 3. Let $\langle f \rangle$ be a fixed class of Boolean functions. We denote by $L(f)$ (resp. $E(f)$) the common complexity (resp. number of essential variables) of the functions of $\langle f \rangle$. The multiplicity of the class $\langle f \rangle$, denoted by $R(f)$, is the number $L(f) - E(f)$: it corresponds to the number of repetitions of variables in a minimal tree of $\langle f \rangle$.

Theorem 1. There exists a sequence $(M_n)_{n \ge 1}$ with $M_n \sim \frac{n}{\ln n}$ and such that, for all fixed class $\langle f \rangle$ of Boolean functions, there exists a positive constant $\lambda_{\langle f \rangle}$ such that the probability of $\langle f \rangle$ satisfies, asymptotically when $n$ tends to infinity,

$$\mathbb{P}_n\langle f \rangle \sim \begin{cases} \lambda_{\langle f \rangle} \cdot \left(\dfrac{1}{k_{n+1}}\right)^{R(f)+1} & \text{if, for large enough } n,\ k_n \le M_n; \\[2ex] \lambda_{\langle f \rangle} \cdot \left(\dfrac{\ln n}{n}\right)^{R(f)+1} & \text{otherwise.} \end{cases}$$

Let us first remark that the constant $\lambda_{\langle f \rangle}$ is independent from $k_n$ (and from $n$). This result has been observed, without explanation, in [6,7].
We note also that both constant functions true and false are alone in their respective equivalence classes, and we define their complexity to be 0.

In the finite context [2,4], each Boolean function was studied separately instead of being considered within its equivalence class. However, the finite context is linked to the particular case of our model where there exists a fixed integer $k$ such that $k_n = k$ for all $n \ge 1$. We can translate the result obtained by Kozik in terms of equivalence classes by summing over all Boolean functions belonging to a given equivalence class: remark that there are $\binom{k}{E(f)} 2^{E(f)}$ functions in the equivalence class of a given Boolean function $f$; therefore, the result of Kozik is equivalent to: for all fixed class $\langle f \rangle$, asymptotically when $k$ tends to infinity,

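$$\lim_{n \to \infty} \mathbb{P}_{n,k}\langle f \rangle \sim \lambda_{\langle f \rangle} \cdot \left(\frac{1}{k}\right)^{R(f)+1}.$$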



Concerning the infinite context [6,7], we notice that the cases such that $k_n$ is larger than $n$ are equivalent to the model $k_n = n$, even if $k_n = \infty$.



4 Technical key points 

Next we state the technical core of our results, and we demonstrate how a threshold appears according to the behaviour of $k_n$ as $n$ tends to infinity.

4.1 Threshold induced by $k_n$'s behaviour

Definition 4. Let $n$ be a positive integer. The sequence $(a_p)_{p \in \{1,\dots,n\}} = \left(\frac{p^n}{p!}\, 2^{-p}\right)_{p \in \{1,\dots,n\}}$ is unimodal. More precisely, there exists an integer $M_n$ such that $(a_p)_p$ is strictly increasing on $\{1, 2, \dots, M_n\}$ and strictly decreasing on $\{M_n + 1, \dots, n\}$.



Note that the sequence $(a_p)$ is related to the terms $T_n$, cf. Proposition 1. Although the sequence $(a_p)$ is not directly equal to the Stirling numbers of the second kind, it is obviously linked to them (cf. Proposition 2 below). Therefore we can expect the same kind of behaviour for their maximum (cf. [10,11]).

Lemma 1. The sequence $(M_n)_n$ is increasing and asymptotically satisfies:

$$M_n \sim \frac{n}{\ln n}.$$
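
The unimodality of $(a_p)$ and the order of magnitude of its mode can be explored numerically. This is a rough sketch under stated assumptions: `mode_index` returns the index maximising $a_p$, used here as a stand-in for $M_n$ (the two may differ by one depending on the convention), and all names are illustrative. Logarithms are used to avoid overflow, and the convergence towards $n/\ln n$ is slow, so only the order of magnitude should be compared.

```python
from math import lgamma, log

def log_a(n, p):
    """log of a_p = p^n / (p! 2^p), computed in log-space to avoid overflow."""
    return n * log(p) - lgamma(p + 1) - p * log(2)

def mode_index(n):
    """Index p in {1, ..., n} maximising a_p (a stand-in for M_n)."""
    return max(range(1, n + 1), key=lambda p: log_a(n, p))

def is_unimodal(n):
    """True if (a_p) first increases, then decreases, on {1, ..., n}."""
    values = [log_a(n, p) for p in range(1, n + 1)]
    rising = [b > a for a, b in zip(values, values[1:])]
    # Unimodal: a block of increases followed by a block of decreases.
    return rising == sorted(rising, reverse=True)

for n in (10**3, 10**4, 10**5):
    # Columns: n, mode of (a_p), rounded n / ln n.
    print(n, mode_index(n), round(n / log(n)))
```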

The proof can be adapted from the approach of Harper [12]. However, simpler arguments are exhibited in Appendix A.

Definition 5. Let us define the following quantity: $B_{n,k_n} = \sum_{p=1}^{k_n} {n \brace p}\, 2^{-p}$. The number $B_{n,k_n}$ quantitatively represents the constraints of labelling the leaves by variables (cf. Proposition 1).

Using the following proposition, we will further exhibit bounds on $B_{n,k_n}$.

Proposition 2 (Comtet, 74). For all $n \ge 1$, for all $p \in \{1, \dots, n\}$,

$$\frac{p^n}{p!} - \frac{(p-1)^n}{(p-1)!} \le {n \brace p} \le \frac{p^n}{p!}.$$

These inequalities can be seen as a specific case of the Bonferroni inequalities (see [13, Section 4.7]). For a simpler proof, refer to Sibuya [14].
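
As a sanity check, the two inequalities of Proposition 2 can be verified exactly on small values with rational arithmetic. The sketch below assumes nothing beyond the standard recurrence for Stirling numbers of the second kind; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def stirling2_table(n_max):
    """table[n][p] = Stirling number of the second kind, 0 <= p <= n <= n_max."""
    table = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    table[0][0] = 1
    for n in range(1, n_max + 1):
        for p in range(1, n + 1):
            table[n][p] = p * table[n - 1][p] + table[n - 1][p - 1]
    return table

def bounds_hold(n_max):
    """Check p^n/p! - (p-1)^n/(p-1)! <= S(n,p) <= p^n/p! for all 1 <= p <= n <= n_max."""
    table = stirling2_table(n_max)
    for n in range(1, n_max + 1):
        for p in range(1, n + 1):
            upper = Fraction(p ** n, factorial(p))
            lower = upper - Fraction((p - 1) ** n, factorial(p - 1))
            if not lower <= table[n][p] <= upper:
                return False
    return True
```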

The next lemma describes the asymptotic behaviour of $B_{n,k_n}$: roughly speaking, before the threshold ($k_n \le M_n$), $B_{n,k_n}$ is equivalent to its last term, and after $M_n$, it is equivalent to the sum of a few terms around $M_n$.

Lemma 2. Let $(u_n)_n$ be an increasing sequence tending to infinity. Then, asymptotically:

$$B_{n,u_n} \sim \frac{u_n^n}{u_n!}\, 2^{-u_n} \quad \text{if } u_n \le M_n \text{ for large enough } n, \tag{1}$$

$$B_{n,u_n} = \Theta\left(\sum_{p=M_n}^{M_n+\eta_n} \frac{p^n}{p!\, 2^p}\right) \quad \text{if } u_n > M_n \text{ for large enough } n, \tag{2}$$

where $\eta_n = \min\{\ln n, u_n - M_n\}$.

This lemma is proved in Appendix A.

Lemma 3. Let us assume that $k_n \le M_n$ for large enough $n$; then, asymptotically when $n$ tends to infinity,

$$\frac{B_{n,k_{n+1}}}{B_{n+1,k_{n+1}}} = \Theta\left(\frac{1}{k_{n+1}}\right).$$

Lemma 4. Let us assume that $k_n > M_n$ for large enough $n$; then, asymptotically when $n$ tends to infinity,

$$\frac{B_{n,k_{n+1}}}{B_{n+1,k_{n+1}}} = \Theta\left(\frac{\ln n}{n}\right).$$

Definition 6. Let the ratio $\mathrm{rat}_n$ be the quantitative evolution of the leaf-labelling constraints from trees of size $n$ to size $n+1$: $\mathrm{rat}_n = B_{n,k_{n+1}} / B_{n+1,k_{n+1}}$. Its asymptotic behaviour has been quantified in Lemmas 3 and 4.
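
The quantities $B_{n,k}$ and $\mathrm{rat}_n$ can be computed exactly for moderate sizes. The following sketch is only an illustration under an extra assumption: it takes a constant number of variables (the parameter `k_next` plays the role of $k_{n+1}$), a case in which the ratio is expected to approach $1/k$. The function names are hypothetical.

```python
from fractions import Fraction

def B(n, k):
    """B_{n,k} = sum_{p=1}^{k} S(n,p) 2^{-p}, via successive rows of Stirling numbers."""
    row = [1]  # row[p] = S(0, p)
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for p in range(1, m + 1):
            below = row[p] if p < len(row) else 0  # S(m-1, p), zero when p = m
            new[p] = p * below + row[p - 1]
        row = new
    return sum(Fraction(row[p], 2 ** p) for p in range(1, min(n, k) + 1))

def rat(n, k_next):
    """rat_n = B_{n,k_{n+1}} / B_{n+1,k_{n+1}}, with k_{n+1} passed as k_next."""
    return B(n, k_next) / B(n + 1, k_next)

# With a constant number of variables, rat_n is close to 1 / k for large n.
approx = float(rat(200, 3))
```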



4.2 Adjustment of Kozik's pattern language theory 

In 2008, Kozik [4] introduced an effective way to study Boolean trees: he defined a notion of pattern that makes it easy to classify and count large trees according to some constraints on their structure. Kozik applied this pattern theory to study and/or trees with a finite number of variables, but the theory has since been extended to different models of Boolean trees (see for example [5]).

Let us adapt the definitions of patterns to our new model and then prove extended versions of the results of Kozik's paper.

Definition 7. A pattern language is a set of binary trees with internal nodes labelled by $\land$ or $\lor$ and with external nodes labelled by • or □. Leaves labelled by • are called pattern leaves and leaves labelled by □ are called placeholders.

Given a pattern language $L$ and a family of trees $\mathcal{M}$, we denote by $L[\mathcal{M}]$ the family of all trees obtained by replacing every placeholder in an element from $L$ by a tree from $\mathcal{M}$.

The generating function of a pattern $L$ is $\ell(x,y) = \sum_{d,p} L(d,p)\, x^d y^p$, where $L(d,p)$ is the number of elements of $L$ with $d$ pattern leaves and $p$ placeholders.

Definition 8. We define the composition of two pattern languages L[P] as the 
pattern language of trees which are obtained by replacing every placeholder of a 
tree from L by a tree from P. 

Definition 9. A pattern language $L$ is sub-critical for a family $\mathcal{M}$ if the generating function $m(z)$ of $\mathcal{M}$ has a square-root singularity $\tau$, and if $\ell(x,y)$ is analytic in some set $\{(x,y) : |x| \le \tau + \varepsilon, |y| \le m(\tau) + \varepsilon\}$ for some positive $\varepsilon$.

Definition 10. Given an element of $L[\mathcal{M}]$, its number of $L$-repetitions is the number of its $L$-pattern leaves minus the number of different variables that appear in the labelling of its $L$-pattern leaves. The number of its $L$-restrictions is the number of its $L$-pattern leaves that are labelled by essential variables of the function computed by the tree, plus the number of its $L$-repetitions.

On the left-hand side of Fig. 1 we have depicted a pattern tree that computes the constant function true whatever the placeholder is replaced by. It exhibits one repetition (of the variable $x_1$) and thus one restriction, since the function true has no essential variables.

Definition 11. Let $\mathcal{I}$ be the family of the trees with internal nodes labelled by a connective and leaves without labelling, i.e. the family of tree-structures.

The generating function of $\mathcal{I}$ satisfies $I(z) = z + 2I(z)^2$, which implies $I(z) = (1 - \sqrt{1 - 8z})/4$, and its dominant singularity is $1/8$.
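
The functional equation for tree-structures translates directly into a convolution recurrence on the coefficients, which can be compared with the closed form $2^{n-1} C_n$ used in Proposition 1. This is a small illustrative sketch (the function name is an assumption); the ratio of consecutive coefficients gives a numerical hint of the dominant singularity $1/8$.

```python
from math import comb

def tree_structure_counts(n_max):
    """Coefficients I_1, ..., I_{n_max} of I(z) = z + 2 I(z)^2, by convolution."""
    counts = [0, 1]  # counts[n] = number of tree-structures with n leaves
    for n in range(2, n_max + 1):
        # Root labelled by one of 2 connectives, subtrees of sizes i and n - i.
        counts.append(2 * sum(counts[i] * counts[n - i] for i in range(1, n)))
    return counts

counts = tree_structure_counts(200)
# Closed form: 2^(n-1) times the (n-1)th Catalan number.
closed_form = [0] + [2 ** (n - 1) * comb(2 * n - 2, n - 1) // n
                     for n in range(1, 201)]
# The growth rate of the coefficients approaches 8, the inverse of the singularity.
growth = counts[200] / counts[199]
```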

The following key lemma is a generalization of Kozik's [4, Lemma 3.8]:

Fig. 1: Left: a pattern tree that computes the function true. Right: a simple tautology.



Lemma 5. Let $L$ be an unambiguous pattern, and $\mathcal{T}$ the family of and/or trees. Let $T_n^{[r]}$ (resp. $T_n^{[\ge r]}$) be the number of labelled (with at most $k_n$ variables) trees of $L[\mathcal{I}]$ of size $n$ and with $r$ $L$-repetitions (resp. at least $r$ $L$-repetitions). We assume that $L$ is sub-critical for the family $\mathcal{I}$ of the unlabelled-leaves trees. Then, asymptotically when $n$ tends to infinity,

$$\frac{T_n^{[r]}}{T_n} = \Theta\left(\mathrm{rat}_n^r\right) \qquad \text{and} \qquad \frac{T_n^{[\ge r]}}{T_n} = O\left(\mathrm{rat}_n^r\right).$$



Proof. The number of labelled trees of $L[\mathcal{I}]$ of size $n$ and with at least $r$ $L$-repetitions is given by:

$$T_n^{[\ge r]} = \sum_{d=r+1}^{n} I_n(d)\, \mathrm{Lab}(n,k_n,d,r),$$

where $I_n(d)$ is the number of tree-structures with $d$ $L$-pattern leaves and the number $\mathrm{Lab}(n,k_n,d,r)$ corresponds to the number of leaf-labellings of these trees giving at least $r$ $L$-repetitions. The following enumeration contains some double-counting and we therefore get an upper bound:

$$\mathrm{Lab}(n,k_n,d,r) \le 2^n \cdot \sum_{j=1}^{r} \binom{d}{r+j} {r+j \brace j} B_{n-r-j+1,k_n}.$$

The factor $2^n$ corresponds to the polarity of each leaf (the literal labelling is positive or negative); the index $j$ stands for the number of different variables involved in the $r$ repetitions; the binomial factor chooses the pattern leaves that are involved in the $r$ repetitions; the Stirling number splits $r+j$ leaves into $j$ parts; finally, the factor $B_{n-r-j+1,k_n}$ chooses which variable is assigned to each class of leaves. Therefore,

$$T_n^{[\ge r]} \le 2^n \cdot B_{n-r,k_n} \sum_{j=1}^{r} {r+j \brace j} \sum_{d=r+j}^{n} I_n(d) \binom{d}{r+j}.$$

Let $\ell(x,y)$ be the generating function of the pattern $L$. Then, for all $p \ge 0$,

$$\sum_{n \ge 1} \sum_{d \ge 1} I_n(d) \binom{d}{p} z^n = \frac{z^p}{p!}\, \frac{\partial^p \ell}{\partial x^p}(z, I(z)).$$

Thus,

$$\frac{T_n^{[\ge r]}}{T_n} \le \frac{B_{n-r,k_n}}{B_{n,k_n}} \sum_{j=1}^{r} {r+j \brace j}\, \frac{[z^n]\, z^{r+j}\, \frac{\partial^{r+j} \ell}{\partial x^{r+j}}(z, I(z))}{(r+j)!\; [z^n]\, I(z)}.$$

Since $z^{r+j} \frac{\partial^{r+j} \ell}{\partial x^{r+j}}(z, I(z))$ and $I(z)$ have the same singularity, because of the sub-criticality of the pattern $L$ with respect to $\mathcal{I}$, the previous sum tends to a constant when $n$ tends to infinity, and so we conclude:

$$\frac{T_n^{[\ge r]}}{T_n} = O\left(\frac{B_{n-r,k_n}}{B_{n,k_n}}\right) = O\left(\mathrm{rat}_n^r\right). \qquad \square$$



5 Behaviour of the probability distribution 

Now that we have adapted the pattern theory to our model, we are ready to study it quantitatively. A first step is to understand the asymptotic behaviour of $\mathbb{P}_n\langle \text{true} \rangle$. It is indeed natural to focus on this "simple" function before considering a general class $\langle f \rangle$; moreover, it happens to be essential for the continuation of the study. In addition, the methods used to study tautologies (mainly pattern theory) will also be the core of the proof for a general equivalence class. We prove in this section the main Theorem 1 for both classes $\langle \text{true} \rangle$ and $\langle \text{false} \rangle$ of complexity zero, using the duality of both connectives $\land$ and $\lor$ and of positive and negative literals. The main ideas of the proof for a general equivalence class will be detailed in Section 5.2, but the details are postponed to Appendix C.



5.1 Tautologies 

Let us recall that a tautology is a tree that represents the Boolean function 
true. Let us consider the family A of tautologies. In this part, we prove that the 
probability of (true) is equivalent to the ratio of a simple subset of tautologies. 

Definition 12 (cf. right-hand side of Fig. 1). A simple tautology is an and/or tree that contains two leaves labelled by a variable $x$ and its negation $\bar{x}$ and such that all internal nodes from the root to both leaves are labelled by $\lor$-connectives. We denote by $ST$ the family of simple tautologies.
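
To make this definition concrete, here is a minimal sketch that tests whether a tree is a simple tautology and, independently, whether it is a tautology by exhaustive evaluation. The tuple encoding of trees (leaf `("x", i, positive)`, node `(op, left, right)`) and all function names are assumptions made for this illustration only.

```python
from itertools import product

# Hypothetical encoding: leaf = ("x", index, positive), node = (op, left, right)
# with op in {"and", "or"}.

def or_reachable_literals(tree):
    """Literals on leaves reached from the root through 'or' nodes only."""
    if tree[0] == "x":
        return {(tree[1], tree[2])}
    op, left, right = tree
    if op == "or":
        return or_reachable_literals(left) | or_reachable_literals(right)
    return set()  # an 'and' node blocks the path

def is_simple_tautology(tree):
    """True if some variable and its negation are both reachable through 'or' nodes."""
    literals = or_reachable_literals(tree)
    return any((var, not positive) in literals for var, positive in literals)

def evaluate(tree, env):
    """Evaluate the tree under env, a dict {variable index: bool}."""
    if tree[0] == "x":
        return env[tree[1]] == tree[2]
    op, left, right = tree
    if op == "or":
        return evaluate(left, env) or evaluate(right, env)
    return evaluate(left, env) and evaluate(right, env)

def variables(tree):
    if tree[0] == "x":
        return {tree[1]}
    return variables(tree[1]) | variables(tree[2])

def is_tautology(tree):
    """Brute-force check over all assignments of the variables of the tree."""
    vs = sorted(variables(tree))
    return all(evaluate(tree, dict(zip(vs, bits)))
               for bits in product((False, True), repeat=len(vs)))

x1, not_x1 = ("x", 1, True), ("x", 1, False)
x2, not_x2 = ("x", 2, True), ("x", 2, False)
simple = ("or", ("or", x1, ("and", x2, x1)), not_x1)
not_simple = ("and", ("or", x1, not_x1), ("or", x2, not_x2))
```

The tree `not_simple` is a tautology whose root is an and-node, so it is not a simple tautology: simple tautologies form a strict subset of the tautologies.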

In order to prove Theorem 1 for the class $\langle \text{true} \rangle$, and even to give the more precise result $\mathbb{P}_n\langle \text{true} \rangle \sim \frac{3}{4} \cdot \mathrm{rat}_n$, we first compute the ratio of simple tautologies.

Lemma 6. The ratio of simple tautologies verifies

$$\mu_n(ST) = \frac{ST_n}{T_n} \sim \frac{3}{4}\, \mathrm{rat}_n, \quad \text{when } n \text{ tends to infinity.}$$

Moreover, asymptotically when n tends to infinity, almost all tautologies are 
simple tautologies. 

Proof. The proof is divided in two steps. The first one is dedicated to the computation of the ratio $\mu_n(ST)$. The second part of the proof shows that almost all tautologies are simple tautologies.

Let us consider the non-ambiguous pattern language $S = \bullet \mid S \lor S \mid \square \land \square$. Remark that a tree in which two $S$-pattern leaves are labelled by a variable and its negation is a simple tautology. The generating function of $S$ is $s(x,y) = \frac{1}{2}\left(1 - \sqrt{1 - 4(x + y^2)}\right)$. It is sub-critical for $\mathcal{I}$. The generating function $\tilde{I}(z) = \frac{1}{2} \frac{\partial^2}{\partial x^2} s(xz, I(z))\big|_{x=1}$ enumerates and/or trees with two marked distinct leaves. Therefore, $2^{n-1} \tilde{I}_n B_{n-1,k_n}$ is the number of simple tautologies, where simple tautologies realized simultaneously by two pairs of leaves are counted twice. The ratio of this family, with double-counting and denoted by $DC$, is given by

$$\mu_n(DC) = \frac{2^{n-1}\, \tilde{I}_n\, B_{n-1,k_n}}{T_n},$$

and using a consequence of [9, Theorem VII.8] (cf. a detailed proof in [7]):

$$\lim_{n \to \infty} \frac{\tilde{I}_n}{I_n} = \lim_{z \to 1/8} \frac{\tilde{I}'(z)}{I'(z)} = \frac{3}{2}.$$

Thus, we get the upper bound $\frac{3}{4} \mathrm{rat}_n$ for the ratio of simple tautologies.
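
The limiting ratio between tree-structures with two marked $S$-pattern leaves and plain tree-structures can be approximated numerically. The sketch below is an independent check under stated assumptions, not the paper's derivation: it tracks, for each size, the value and the first two derivatives at $x = 1$ of the polynomial counting tree-structures by number of $S$-pattern leaves (all names are illustrative). With unordered pairs of marked leaves the ratio appears to approach $3/2$, which corresponds to $3$ when the two marked leaves are ordered; combined with the factor $2^{n-1}/2^n$ this is consistent with the constant $3/4$.

```python
def marked_pair_ratio(n_max):
    """(Unordered pairs of S-pattern leaves) / (tree-structures), at size n_max."""
    # For each size n, v0[n], v1[n], v2[n] hold P_n(1), P_n'(1), P_n''(1), where
    # P_n(x) = sum_d I_n(d) x^d and d counts the S-pattern leaves.
    v0, v1, v2 = [0, 1], [0, 1], [0, 0]
    for n in range(2, n_max + 1):
        a0 = a1 = a2 = 0
        for i in range(1, n):
            j = n - i
            # Root labelled 'or': pattern leaves of both subtrees are kept
            # (product rule for the derivatives of P_i(x) * P_j(x)).
            a0 += v0[i] * v0[j]
            a1 += v1[i] * v0[j] + v0[i] * v1[j]
            a2 += v2[i] * v0[j] + 2 * v1[i] * v1[j] + v0[i] * v2[j]
            # Root labelled 'and': two placeholders, hence no pattern leaf.
            a0 += v0[i] * v0[j]
        v0.append(a0)
        v1.append(a1)
        v2.append(a2)
    # P''(1) counts ordered pairs of distinct pattern leaves; halve for unordered.
    return v2[n_max] / (2 * v0[n_max])
```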
It remains to deal with the double-counting in order to compute a lower bound. In the family $DC$, simple tautologies realized by a unique pair of leaves are counted once, those realized by two pairs of leaves are counted twice, and so on. Let us denote by $ST^i$ the family of simple tautologies counted exactly $i$ times. The inclusion-exclusion principle gives: $ST_n = DC_n - \sum_{i \ge 2} (-1)^i\, ST^i_n$. Moreover, it can be seen that a tree in $ST^2$ (resp. in $ST^3$, resp. $ST^i$) has at least 2 (resp. 3, resp. $\lceil 2\sqrt{i} - 1 \rceil$) $S$-repetitions. Therefore, by Lemma 5, the ratio of the family $ST^i$ is:

$$\mu_n(ST^i) = O\left(\mathrm{rat}_n^{\lceil 2\sqrt{i} - 1 \rceil}\right), \quad \text{when } n \text{ tends to infinity.}$$

Thus,

$$\left|\mu_n(DC) - \mu_n(ST)\right| = \left|\sum_{i \ge 2} (-1)^{i+1} \mu_n(ST^i)\right| \le \sum_{i \ge 2} \mu_n(ST^i) = o(\mathrm{rat}_n).$$

Consequently, asymptotically, $\mu_n(ST) = \mu_n(DC) + o(\mathrm{rat}_n) \sim \frac{3}{4} \cdot \mathrm{rat}_n$.
Let us now turn to the second part of the proof: asymptotically, almost all tautologies are simple tautologies. Let us consider the pattern $N = \bullet \mid N \lor N \mid \square \land N$. This pattern is unambiguous, its generating function verifies $n(x,y) = x + n(x,y)^2 + y \cdot n(x,y)$ and is thus equal to $\frac{1}{2}\left(1 - y - \sqrt{(1-y)^2 - 4x}\right)$. It implies that $N$ is sub-critical for the family $\mathcal{I}$ of tree-structures.

A tautology has at least one $N[N]$-repetition: otherwise, we can assign all its $N$-pattern leaves to false and the whole tree computes false, which is impossible for a tautology.

Consider a tautology $t$ with exactly one $N[N]$-repetition: this repetition must be an $x \mid \bar{x}$ repetition and must occur among the $N$-pattern leaves, by the same kind of argument as above.

Then, let us assume that there is an $\land$-node denoted by $\nu$ between the $N$-pattern leaf $x$ and the root of the tree. This node $\nu$ has a left subtree $t_1$ and a right subtree $t_2$. Assume that the leaf $x$ appears in $t_1$. Then, one can assign all the $N$-pattern leaves of $t_2$ (which are $N[N]$-pattern leaves of $t$) to false, since there is no more repetition among the $N[N]$-pattern leaves of $t$. Also assign all the pattern leaves of $t$ minus the subtree rooted at $\nu$ to false. Then, we can see that $t$ computes false: impossible. We have thus shown that $t$ is a simple tautology.

Finally, tautologies with exactly one $N[N]$-repetition are simple tautologies, a tautology must have at least one $N[N]$-repetition and, thanks to Lemma 5, tautologies with more than one $N[N]$-repetition have a ratio of order $o(\mathrm{rat}_n)$, which is negligible in front of the ratio of simple tautologies. □

5.2 Probability of a general class of functions 

With arguments similar to those used for tautologies, we prove that the probability of the class of projections (i.e. $(x_i)_{i \ge 1} \mapsto x_j$) is equivalent to $\frac{5}{8} \cdot \mathrm{rat}_n$. The proof is detailed in Appendix B.

Let us turn now to the general result: the behaviour of $\mathbb{P}_n\langle f \rangle$ for all fixed $f \in \mathcal{F}$. The main idea of this part is that, roughly speaking, a typical tree computing a Boolean function in $\langle f \rangle$ is a minimal tree of $\langle f \rangle$ in which a single large tree has been plugged. The goal of this section is to give the main ideas of the proof of Theorem 1; the complete proof is given in Appendix C.

Proof (sketch). For a given class of Boolean functions $\langle f \rangle$, our goal is to obtain an asymptotic equivalent to $\mathbb{P}_n\langle f \rangle$.

— We first define several notions of expansions of a tree: the idea is to replace, in a tree, a subtree $S$ by $T \land S$, where $T$ is chosen such that the expanded tree still computes the same function.

— The ratio of minimal trees of $\langle f \rangle$ expanded once is of the order of $\mathrm{rat}_n^{R(f)+1}$.

— The ratio of trees computing a function from $\langle f \rangle$ is equivalent to the ratio of minimal trees expanded once.

The most technical part of the proof is the last one, because we need a precise upper bound on $\mathbb{P}_n\langle f \rangle$. But the ideas are more or less the same as those developed for the class $\langle \text{true} \rangle$. □
6 Conclusion

We studied a new model of and/or trees which is the first one (to our knowledge) to allow the number of variables to depend on the size of the trees under consideration.

Choosing the context of and/or trees allowed us to generalize Kozik's powerful pattern theory, but we are convinced that all our results also hold in implicational models or in non-binary or non-plane models. Indeed, the key idea is that each repetition induces a factor $\mathrm{rat}_n$, and this remains true in all those models, although pattern theory does not adapt to every model, e.g. models with implication. Extending our results to these models would give nice unifications of the known results of the literature: papers [4,8,7] and [15,5].
A Proofs of the technical core

Proof (of Lemma 1). Let $p$ be an integer in $\{1, \dots, n-1\}$. By Definition 4,

$$\frac{a_{p+1}}{a_p} = \left(\frac{p+1}{p}\right)^n \frac{1}{2(p+1)},$$

and consequently, for large enough $n$,

$$\frac{a_{p+1}}{a_p} \ge 1 \iff n \ln\left(\frac{p+1}{p}\right) - \ln(2(p+1)) \ge 0.$$

The function $p \mapsto n \ln\left(\frac{p+1}{p}\right) - \ln(2(p+1))$ is strictly decreasing. Since it tends to $+\infty$ at $p = 1$ and to $-\infty$ at $p = n-1$, both when $n$ tends to infinity, there exists a unique $M_n$ such that $(a_p)$ is strictly increasing on $\{1, \dots, M_n\}$ and strictly decreasing on $\{M_n + 1, \dots, n\}$.
Let us denote by $x_n$ the single solution of the equation:

$$\left(\frac{x+1}{x}\right)^n \frac{1}{2(x+1)} = 1. \tag{3}$$

Since, asymptotically when $n$ tends to infinity,

$$\left(\frac{\frac{n}{\ln n} + 1}{\frac{n}{\ln n}}\right)^n \frac{1}{2\left(\frac{n}{\ln n} + 1\right)} \sim \frac{\ln n}{2},$$

we have that $n/\ln n < x_n$ and therefore $x_n$ tends to infinity. Thus, Equation (3) evaluated at $x_n$ is equivalent to

$$n \ln\left(\frac{x_n + 1}{x_n}\right) = \ln 2 + \ln(x_n + 1),$$

which implies $x_n \ln x_n \sim n$ when $n$ tends to infinity. We easily deduce from this asymptotic relation that $\ln x_n \sim \ln n$ and that $x_n \sim \frac{n}{\ln n}$ when $n$ tends to infinity. Since $M_n = \lfloor x_n \rfloor$, we conclude that $M_n \sim n/\ln n$ when $n$ tends to infinity. □

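In view of Proposition 2, we have the following bounds:

$$\frac{1}{2} \sum_{p=1}^{u_n - 1} \frac{p^n}{p!\, 2^p} + \frac{u_n^n}{u_n!\, 2^{u_n}} \le B_{n,u_n} \le \sum_{p=1}^{u_n} \frac{p^n}{p!\, 2^p}. \tag{4}$$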

Proof (of Lemma 2, assertion (1)). When $u_n \le M_n$ for large enough $n$, let us prove that both bounds of Equation (4) are of the same order as $n$ tends to infinity. Let us first prove that both bounds are equivalent to their last (and common) term, namely $u_n^n \cdot (u_n!\, 2^{u_n})^{-1}$.

Let us denote by $S_{u_n - 1}$ the sum $\sum_{p=1}^{u_n - 1} a_p$. Let $\varepsilon$ be positive. We define $\delta_n$ as the minimum value between $u_n - 1$ and $(\ln n)^{1-\varepsilon}$. We divide the sum $S_{u_n - 1}$ in two parts: the last $\delta_n$ terms and the other ones (if they exist). Let us recall that $(a_p)_{p \ge 1}$ is increasing while $p \le M_n$. It implies

"««— l ^ s- a u n -l , 0, Un -i-g n 
< On ' h Mn ' • 

Q"U n Q'Un Q"U n 

Let us first focus on the following factor: 



a Un y u n J In n 

Since <L < (lnn) 1_e , we have that S„ ■ a "" -1 < ,. 2 ,, . 

If <J„ = w n — 1, then S Un —i is negligible in front of a Un . Otherwise, if S n 
(lnn) 1_e , then 



au »- 1 - 5 < (2 Ur y+^ ■ fi- 1 + vin ") < 

Ou. \ Mr,. / V In 71 



/In n 



where $a_n \lesssim b_n$ means that $a_n$ is smaller than a quantity equivalent to $b_n$, asymptotically when $n$ tends to infinity. Thus

Q / \ i + Vlnn „ 

< n • + vlnn ■ >• 0. 

a Un \ In n / In n 

So $S_{u_n - 1}$ is negligible in front of $a_{u_n}$ in that case too. Finally, $B_{n,u_n}$ is equivalent to $a_{u_n}$, when $n$ tends to infinity. □

Proof (of Lemma 2, assertion (2)). In the case $u_n > M_n$, it does not seem true that both sums are equivalent to the largest term $a_{M_n}$. However, such a precise result is not necessary.

Let $\eta_n = \min\{u_n - M_n, \ln n\}$. In both bounds of Equation (4), we separate the sums in three parts: the first one from indices 1 to $M_n - 1$; the second one from $M_n$ to $M_n + \eta_n$; and the third one from $M_n + \eta_n + 1$ to $u_n$ (this last sum may be empty).

Using assertion (1) of Lemma 2 and the fact that the second part of the sum contains the term $a_{M_n}$, we conclude that the first part of the sum is negligible. Let us now prove that the third part (when it is not empty, i.e. $\eta_n = \ln n$) is negligible too, in front of the second.

Let us denote by $t_n$ the third part of the sum divided by $a_{M_n}$. Since $(a_p)$ is decreasing when $p \ge M_n + 1$, we get:

$$t_n = \sum_{p = M_n + \eta_n + 1}^{u_n} \frac{p^n}{p!}\, 2^{-p}\, \frac{M_n!\, 2^{M_n}}{M_n^n} \le u_n\, \frac{(M_n + \eta_n)^n}{(M_n + \eta_n)!}\, 2^{-M_n - \eta_n}\, \frac{M_n!\, 2^{M_n}}{M_n^n}.$$

Thus, the Stirling formula gives

t„ < cxp f lnu„ + -7„(1 - ln(2)) + (r) n - M n ) In f 1 + -y- J - r?„ In (M„ + ry ?! 






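Since $\eta_n = O(\ln n)$ and $M_n \sim n/\ln n$, we get:

$$t_n \le \exp\left(-\eta_n \ln n + \eta_n \ln(\ln n)\right) \to 0.$$

Thus the third part of the sum is negligible and, as $n$ tends to infinity,

$$B_{n,u_n} = \Theta\left(\sum_{p=M_n}^{M_n + \eta_n} \frac{p^n}{p!\, 2^p}\right). \qquad \square$$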



Proof (of Lemma 3). In view of Lemma 2 applied to $u_n = k_{n+1}$, if $k_{n+1} \le M_n$, we get

$$\frac{B_{n,k_{n+1}}}{B_{n+1,k_{n+1}}} \sim \frac{k_{n+1}^n}{k_{n+1}!} \cdot \frac{k_{n+1}!}{k_{n+1}^{n+1}} \cdot 2^{k_{n+1} - k_{n+1}} = \frac{1}{k_{n+1}}.$$

Otherwise, using Lemma 2 when $M_{n+1} \ge k_{n+1} > M_n$, there exists a constant $\alpha$ such that



B n ,k n+1 < hn / M„ \ fen+I 



< a ^l ^L fe 



-Afn 



v n+l 



B n +i,k n+ i k n+ i \k n+ i 

l / i\,r \ n—M n — lfo 

fc n+1 VM n + J7„; n+1 

since $M_n + \eta_n \le k_{n+1}$ by definition of $\eta_n$, and $k_{n+1} \le M_{n+1}$ by assumption.
Therefore,

B n ,k n+1 ^ I 



'+ 1 < 



-Bn+l,fc„ + i fcn+1 

and Lemma [3] is proved. □ 

Proof (of Lemma 0). We have $k_{n+1} > M_{n+1}$, so $k_{n+1} > M_n$. We thus apply
the assertion of Lemma [2] for the case $u_n > M_n$, with $u_n = k_{n+1}$. Consequently,
there exists a constant $\alpha$, for large enough $n$, such that:

B n,k n + 1 < a2 M» +1 -M n +r ? » +1 Vn M™ {M n+1 + r/ n+1 )\ 

B n +l,k n +l ~~ Vn+1 (M n+ l + 1] n+ l) n+1 M n \ 

< n .oM„ +1 -M„ g» M " M »+l! 

Using the Stirling formula, both properties of Lemma [1] and the fact that
$\eta_n/\eta_{n+1}$ tends to 1, we conclude that

$$\frac{B_{n,k_{n+1}}}{B_{n+1,k_{n+1}}} = O\left(\frac{\ln n}{n}\right)$$

and the stated result is proved. □ 
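
The ratio $B_{n,k}/B_{n+1,k}$ handled in the proofs above can be explored numerically. The sketch below is only an illustration under a stated assumption: it takes $B_{n,k}$ to be the number of partitions of an $n$-element set into at most $k$ non-empty blocks (a sum of Stirling numbers of the second kind), which is not restated in this appendix. The helper name `partitions_at_most` is hypothetical.

```python
from fractions import Fraction


def partitions_at_most(n_max, k):
    """Return a list b with b[n] = number of partitions of an n-set into
    at most k non-empty blocks, for 1 <= n <= n_max (b[0] is unused).

    ASSUMPTION: this is the quantity written B_{n,k} in the text.
    """
    row = [0] * (k + 1)
    row[0] = 1  # S(0, 0) = 1
    totals = [None]
    for n in range(1, n_max + 1):
        new = [0] * (k + 1)
        for p in range(1, min(n, k) + 1):
            # Stirling recurrence: S(n, p) = p * S(n-1, p) + S(n-1, p-1)
            new[p] = p * row[p] + row[p - 1]
        row = new
        totals.append(sum(row[1:]))
    return totals


# For a fixed small k, the ratio B_{n,k} / B_{n+1,k} approaches 1/k.
b = partitions_at_most(201, 3)
ratio_small_k = Fraction(b[200], b[201])
print(float(ratio_small_k))
```

With $k$ fixed, the dominant term is $k^n/k!$, so the printed ratio is very close to $1/3$, which matches the $1/k_{n+1}$ behaviour obtained in the proof of Lemma 3 when $k_{n+1}$ stays below the threshold.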
B Probability of the class of projections

Studying the probability of true is essential to understand the model, while
studying the projections is not strictly necessary. However, it helps to become
more familiar with the model and often makes it possible to conjecture the general
behaviour of $P_n(f)$. This is sufficient reason to study $P_n(x)$ in detail (where $x$
is a literal). We will not detail all the proofs, which are very similar to those of
Section [5].

To calculate the probability of the class of projections we will follow the ideas 
presented for tautologies: we define a set of trees of simple shape that compute 
the projection x and call such trees "simple-x" and then show that the ratio of 
simple-x is, asymptotically when the size of the trees n tends to infinity, equal 
to the probability of the projection. 

Definition 13 (cf. Figure [2]). A simple-x of type T is a tree with one subtree
reduced to a single leaf and the other subtree being a simple tautology if the root's
label is $\land$, or a simple contradiction if the root's label is $\lor$.

A simple-x of type X is a tree with one subtree reduced to a single leaf $\ell$,
the root labelled by $\land$ (resp. $\lor$), and the other subtree such that there exists a
leaf labelled by the same literal as $\ell$ linked to the root by a $\lor$-only path
(resp. $\land$-only path).

We denote by X the family of simple-x.

Clearly, every simple-x computes the projection x.
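
The claim that every simple-x computes the projection can be checked on small instances by brute force. The following is a minimal sketch, not the paper's construction: the tuple encoding of trees and the helper names `evaluate` and `truth_table` are assumptions made for illustration, and the "simple tautology" used here is assumed to be a tree in which a literal and its negation are both reached from the root by $\lor$-only paths.

```python
from itertools import product


def evaluate(tree, env):
    """Evaluate an and/or tree encoded as nested tuples:
    ("var", name, positive), ("and", left, right) or ("or", left, right)."""
    kind = tree[0]
    if kind == "var":
        _, name, positive = tree
        return env[name] if positive else not env[name]
    left, right = evaluate(tree[1], env), evaluate(tree[2], env)
    return (left and right) if kind == "and" else (left or right)


def truth_table(tree, names):
    """Return the Boolean function of the tree as a tuple of outputs."""
    return tuple(
        evaluate(tree, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )


names = ("x", "y", "z")
x, y, z = (("var", v, True) for v in names)
not_y = ("var", "y", False)

projection = truth_table(x, names)

# Type T: x AND (tautology), with y and (not y) on an OR-only path.
type_t = ("and", x, ("or", y, ("or", z, not_y)))
# Type X, root AND: the other subtree has a leaf x on an OR-only path.
type_x_and = ("and", x, ("or", ("and", y, z), x))
# Type X, root OR: dual case, leaf x on an AND-only path.
type_x_or = ("or", x, ("and", ("or", y, z), x))

for tree in (type_t, type_x_and, type_x_or):
    print(truth_table(tree, names) == projection)
```

The type X cases are instances of the absorption laws $x \land (x \lor g) = x$ and $x \lor (x \land g) = x$.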
Fig. 2: Examples of simple-x.



Lemma 7. If $X_n^T$ is the number of type T simple-x of size $n$, we have, when $n$
tends to infinity:

$$\frac{X_n^T}{T_n} \sim \frac{3}{8}\,\mathrm{rat}_n.$$
Proof. We have:

Xl A-2 n - 1 B n _ l>kn [z n - 1 }^s{zx,I{z))\ a 



T T 

because a type T simple-x of size $n$ is either a tree rooted by $\land$ or a tree rooted
by $\lor$ (which gives a factor 2), with either its right or its left subtree being a
single leaf (which also gives a factor 2), and the other subtree being a simple
tautology or a simple contradiction (depending on the root's label) of size $n - 1$.
Note that this equation only holds asymptotically when $n$ tends to infinity,
since it involves some double counting, which becomes negligible in the limit.
Thus, asymptotically when $n$ tends to infinity,

Xl 4 • 2"- 1 B„_ 1 , fc „ [z n - x ]^s{zx,I{z))\ x=1 _ 2 • 2 n - 1 B n ^ kn 7„_ x 



J-n,k n ^ £>n,k n In ^ ^n,k n -*n 

We already have proved: J «/7 n ~ 3, and /«-i/j n = 1 /a, so the result is proved. 

□ 

Lemma 8. If $X_n^X$ is the number of type X simple-x of size $n$, we have,
asymptotically when $n$ tends to infinity,

$$\frac{X_n^X}{T_n} \sim \frac{\mathrm{rat}_n}{4}.$$


Proof. We have: 

$$\frac{X_n^X}{T_n} \sim \frac{4 \cdot 2^{n-1}\, B_{n-1,k_n}\, [z^{n-1}]\, \partial_x S(zx, I(z))\big|_{x=1}}{2^n\, B_{n,k_n}\, I_n}$$

because a type X simple-x of size $n$ is either a tree rooted by $\land$ or a tree rooted
by $\lor$ (which gives a factor 2), with either its right or its left subtree being a
single leaf (which also gives a factor 2), and because the other subtree is a tree
in which we have chosen one $S$-pattern leaf and labelled it with the same label
as the first-level leaf. Since several $S$-pattern leaves can simultaneously have
the same label as the leaf subtree, we do some double counting but, once again,
thanks to Lemma [5], this double counting becomes negligible when $n$ tends to
infinity. Thus,

$$\frac{X_n^X}{T_n} \sim \frac{4 \cdot 2^{n-1}\, B_{n-1,k_n}}{2^n\, B_{n,k_n}} \cdot \frac{1}{8}.$$

Since $[z^{n-1}]\,\partial_x S(zx, I(z))|_{x=1} / I_{n-1} \sim 1$ and $I_{n-1}/I_n \sim 1/8$, we get the result. □
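
The limit $1/8$ for the ratio of consecutive tree counts can be checked directly. The sketch below assumes that $I_n$ counts binary plane trees with $n$ leaves whose $n-1$ internal nodes each carry one of the two connectives, that is $I_n = C_{n-1} 2^{n-1}$ with $C_m$ the Catalan numbers; this closed form is an assumption here, not quoted from the text, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


def shapes(n):
    """ASSUMED value of I_n: binary plane trees with n leaves,
    each of the n - 1 internal nodes labelled by 'and' or 'or'."""
    return catalan(n - 1) * 2 ** (n - 1)


def consecutive_ratio(n):
    """Exact value of I_{n-1} / I_n as a fraction."""
    return Fraction(shapes(n - 1), shapes(n))


for n in (10, 100, 1000):
    print(n, float(consecutive_ratio(n)))
```

Under this assumption the ratio equals $n/(8n-12)$ exactly, so it decreases towards $1/8$ as $n$ grows.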

Lemma 9. Asymptotically when n tends to infinity, the ratio of simple-x is 
equal to the probability of the projection. 

The proof of this lemma is very similar to the proof of Lemma [5].
C Probability of a general class of Boolean functions

In the following, $\langle f \rangle$ is fixed (and $f$ is one of its representatives). T is an and/or
tree computing $f$. Moreover, we will need to consider the patterns
$R = N^{(r+1)}[N \oplus P]$ and $\bar{R} = N^{(r+1)}[(N \oplus P)^2]$. Note that the language
$N \oplus P$ is defined such that the $N \oplus P$-pattern leaves of a tree are its $N$-pattern
leaves plus its $P$-pattern leaves. It is proved in [4] that this pattern language is
indeed non-ambiguous and sub-critical for $I$ if $N$ and $P$ are.

Proposition 3. A tree $t$ computing $f$ with at least one leaf on the $(r+2)$-th level
of the $R$ pattern must have at least $R(f) + 1$ $R$-repetitions.

Proof. Let us assume that $t$ computes $f$, has at least one leaf on the $(r+2)$-th
level of the $R$ pattern, but has at most $R(f)$ $R$-repetitions. Let $i$ be the smallest
integer (smaller than $r+2$) such that the number of $N^{(i)}$-restrictions is equal to
the number of $N^{(i-1)}$-restrictions.

There must be either a repetition or an essential variable in the first level:
if there is none, then we can assign all the $N$-pattern leaves to false and this
operation does not change the computed function. The computed function is
then the constant function false, which is impossible; so $i < r + 1$.

First Case: Let us assume that there are strictly fewer than $r$ $N^{(i)}$-restrictions.
There is no repetition and no essential variable in the pattern leaves at level $i$.
Therefore, we can assign them all to false and make the placeholders of the level
$i - 1$ compute false. Let us replace those placeholders by false in the tree.
Furthermore, replace by false all the remaining non-essential variables. Then simplify
the obtained tree to eliminate all the constant leaves false and true. We obtain a
tree $t^*$, which still computes $f$, and whose leaves are all former $N^{(i-1)}$-pattern
leaves of $t$ labelled by essential variables. The tree $t^*$ therefore contains strictly
fewer than $r$ leaves, which is impossible since the complexity of $f$ is $r$.

Second Case: Let us assume that $t$ has exactly $r$ $N^{(i)}$-restrictions. Since $i < r+1$,
there is no restriction in the placeholders of the level $r + 2$. Therefore, we can
replace the placeholders by wildcards $\star$, which means that those wildcards can
be evaluated to true or false independently from each other and without changing
the function computed by $t$. We can also replace the remaining leaves labelled
by non-essential and non-repeated variables by such wildcards.

We simplify those wildcards. Such a simplification has to delete at least one
non-wildcard leaf. If we deleted a non-repeated essential variable, then the tree
$t^*$ does not depend on this essential variable and computes $f$: this is impossible.
Thus, we deleted a repetition: $t^*$ has strictly fewer than $R(f)$ repetitions and
computes $f$. This is impossible. □
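
The "assign to false, then simplify" step used in the First Case can be sketched as constant propagation on an and/or tree. This is only an illustration with a hypothetical tuple encoding and hypothetical helper names (`substitute`, `simplify`); it covers the substitution of constants, not the wildcard simplification of the Second Case.

```python
def substitute(tree, assignment):
    """Replace every leaf whose variable is in `assignment` by a constant."""
    if isinstance(tree, bool):
        return tree
    if tree[0] == "var":
        _, name, positive = tree
        if name in assignment:
            value = assignment[name]
            return value if positive else not value
        return tree
    return (tree[0],
            substitute(tree[1], assignment),
            substitute(tree[2], assignment))


def simplify(tree):
    """Remove constant leaves bottom-up: False absorbs 'and' and is
    neutral for 'or'; True absorbs 'or' and is neutral for 'and'."""
    if isinstance(tree, bool) or tree[0] == "var":
        return tree
    op = tree[0]
    left, right = simplify(tree[1]), simplify(tree[2])
    absorbing = (op == "or")
    for a, b in ((left, right), (right, left)):
        if isinstance(a, bool):
            # An absorbing constant wins; a neutral one disappears.
            return a if a == absorbing else b
    return (op, left, right)


x = ("var", "x", True)
y = ("var", "y", True)
z = ("var", "z", True)

# x OR (y AND z), with y assigned to false, simplifies to the single leaf x.
tree = ("or", x, ("and", y, z))
print(simplify(substitute(tree, {"y": False})))
```

In the example, setting one leaf to false deletes a second leaf ($z$) as well, which is the mechanism the proof relies on to bound the number of remaining leaves.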

Note that in Lemma we only count repetitions, and not restrictions as was
done in the original lemma by Kozik, because in terms of equivalence classes
essential variables are no longer relevant. However, we will still need to
consider essential variables, and the following lemma allows us to handle them.
Lemma 10. Let $L$ be an unambiguous pattern, sub-critical for $I$. Let $f$ be a
fixed Boolean function and $\mathcal{M}_f$ the set of minimal trees computing $f$. Let $\mathcal{E}$
be the family of trees obtained by expanding once a tree of $\mathcal{M}_f$ by trees having
exactly $p$ $L$-restrictions. Then,

$$\mu_n(\mathcal{E}) \sim \alpha \cdot \mathrm{rat}_n^{R(f)+p},$$

with $\alpha > 0$ a constant.

Proof. Let $E_n$ be the number of trees of size $n$ in $\mathcal{E}$. We will denote by $i$ the
number of leaves that are involved in the $p$ $L$-restrictions of the expansion tree:
$i$ is at least $p + 1$ and at most $2p$. With negligible double-counting,



/; ( c \ V \~ n - L (f>] (p( T ~ nvm 


&n—p-R(f)M n 


/M £ J T - Z^ \- Z \l\Q x i{ t{ < XZ > 1[ < Z )))\x=l 

i=p+l 


Z ln±3n,k n 


Since $L$ is sub-critical for $I$,




i— p+1 


(r\ Ll » 



asymptotically when n tends to infinity. Therefore, in view of Section HI 

$$\mu_n(\mathcal{E}) \sim \alpha \cdot \mathrm{rat}_n^{R(f)+p}.$$



Consider the family of trees obtained by replacing, in a minimal tree of $f$, a
subtree $s$ by $s \land t_e$, where $t_e$ is a simple tautology. Let us denote by $E_n$ the
number of such trees of size $n$. Since a simple tautology has at least one
$S$-restriction, thanks to Lemma [10],

$$\frac{E_n}{T_n} \sim \alpha \cdot \mathrm{rat}_n^{R(f)+1}.$$

Thanks to Lemma [5], we know that terms computing $f$ with at least $R(f) + 2$
repetitions are negligible in front of the above family. Therefore, since trees
with no leaf on the $(r+2)$-th level are negligible, we have proved Theorem [1].

In fact, we can show a more precise result: 

Theorem 2. Let $f$ be a fixed Boolean function. Then, asymptotically when $n$
tends to infinity,

$$P_n(\langle f \rangle) \sim \lambda_{\langle f \rangle} \cdot \mathrm{rat}_n^{R(f)+1},$$

where $\lambda_{\langle f \rangle}$ is a positive constant.

The key point of the proof of this theorem is that a typical tree computing
a function from $\langle f \rangle$ is a minimal tree of this function which has been expanded
once. In the following, we will only consider two different expansions:
Fig. 3: An expansion at node v. Note that the expansion tree $t_e$ could have been
on the right side of the $\circ$-connective instead of its left side.



Definition 14 (cf. Figure [3]). Recall that an expansion of a tree $t$ is a tree
obtained by replacing a subtree $s$ of $t$ by $s \circ t_e$ (or $t_e \circ s$) where $\circ \in \{\land, \lor\}$.

An expansion is a T-expansion if the expansion tree $t_e$ is a simple tautology
and the connective $\circ$ is $\land$ (or a simple contradiction and the connective $\circ$ is $\lor$).

An expansion is an X-expansion if the expansion tree $t_e$ has a leaf linked to
the root by a $\land$-path (resp. a $\lor$-path) and the $\circ$ connective is a $\lor$ (resp. $\land$).
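
The two kinds of expansion can be illustrated on a tiny example. The sketch below is an assumption-laden illustration rather than the paper's formalism: trees are nested tuples, `expand` is a hypothetical helper that replaces the subtree at a given path by $s \circ t_e$, and in the X-expansion the distinguished leaf of $t_e$ carries the same variable as the replaced leaf $s$.

```python
from itertools import product


def evaluate(tree, env):
    """Evaluate ("var", name, positive) / ("and", l, r) / ("or", l, r)."""
    if tree[0] == "var":
        _, name, positive = tree
        return env[name] if positive else not env[name]
    left, right = evaluate(tree[1], env), evaluate(tree[2], env)
    return (left and right) if tree[0] == "and" else (left or right)


def truth_table(tree, names):
    return tuple(
        evaluate(tree, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )


def expand(tree, path, t_e, op, expansion_on_left=False):
    """Replace the subtree s reached by `path` (child indices 1 or 2)
    with (op, s, t_e), or (op, t_e, s) if expansion_on_left is set."""
    if not path:
        return (op, t_e, tree) if expansion_on_left else (op, tree, t_e)
    children = list(tree)
    children[path[0]] = expand(tree[path[0]], path[1:], t_e, op,
                               expansion_on_left)
    return tuple(children)


names = ("x", "y", "z")
x, y, z = (("var", v, True) for v in names)
not_z = ("var", "z", False)

minimal = ("and", x, y)  # a minimal tree of f = x AND y

# T-expansion at the root: s AND (z OR not z).
t_expanded = expand(minimal, (), ("or", z, not_z), "and")
# X-expansion at the leaf x: s OR (x AND z), where x sits on an AND-path.
x_expanded = expand(minimal, (1,), ("and", x, z), "or")

print(truth_table(t_expanded, names) == truth_table(minimal, names))
print(truth_table(x_expanded, names) == truth_table(minimal, names))
```

Both expanded trees have more leaves than the minimal tree but compute the same function, whereas an arbitrary expansion (for instance conjoining the root with the single leaf $z$) changes it.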

Lemma 11. The ratio of minimal trees of $f$ expanded once satisfies,
asymptotically when $n$ tends to infinity,

$$\mu_n(E[\mathcal{M}_f]) = \alpha \cdot \mathrm{rat}_n^{R(f)+1} + o\left(\mathrm{rat}_n^{R(f)+1}\right).$$

This lemma is a direct consequence of Lemma [10].

Lemma 12. Let $f$ be a fixed Boolean function and let $\mathcal{M}_f$ be the set of minimal
trees of $f$. Then

$$\mu_n(f) \sim \mu_n(E[\mathcal{M}_f]) \quad \text{when } n \to +\infty.$$

Proof. Let $t$ be a term computing $f$. Such a term must have at least $R(f) + 1$
$R$-repetitions. Moreover, thanks to Lemma, trees with at least $R(f) + 2$
$R$-repetitions are negligible. We will show that a tree with exactly $R(f) + 1$
$R$-repetitions is in fact a minimal tree expanded once.

The term $t$ must also have $R(f) + 1$ $\bar{R}$-repetitions and therefore, there is no
additional repetition when we consider the $(r+3)$-th level of the $\bar{R}$-pattern.

Let $i$ be the first level such that the number of $N^{(i)}$-restrictions is equal to
the number of $N^{(i-1)}$-restrictions. Since there must be a restriction on the first
level, $i < r + 1$.



First Case: Assume that an essential variable $a$ appears on the pattern leaves of
the $(r+3)$-th level. Therefore, $t$ has at most $L(f)$ $N^{(i)}$-restrictions. Let us replace
the placeholders of the $(i-1)$-th level by false and assign all the remaining
non-essential variables to false. Simplify the tree to obtain a new and/or tree denoted
by $t^*$. The leaves of this tree are former $N^{(i-1)}$-pattern leaves of $t$, labelled by
essential variables, and $t^*$ still computes $f$. But the variable $a$ is essential for $f$:
thus it must still appear in the leaves of $t^*$, and by deleting its occurrence in the
leaves of the $(r+3)$-th level, we deleted one repetition. Therefore, $t^*$ has at most
$L(f) - 1$ leaves, which is impossible!

Second Case: There is no essential variable among the pattern leaves of the
$(r+3)$-th level. Since there is also no repetition at this level, we can replace the
placeholders of the level $(r+3)$ by wildcards. We also replace the remaining
non-essential and non-repeated variables by wildcards. We then simplify the
wildcards and obtain a simplified tree $t^*$, computing $f$, with no wildcards and
whose leaves are former leaves of the tree $t$, essential or repeated. During the
simplification process, we have deleted at least one of these leaves and therefore
$t^*$ has at most $L(f)$ leaves: it is a minimal tree of $f$.

Let us consider the following fact: the lowest common ancestor of all the
wildcards in $t$ has been suppressed during the simplification process.
Assume that this fact is false: then two wildcards have been simplified
independently during the simplification process, and thus at least two essential or
repeated variables have been deleted. The tree $t^*$ has thus at most $L(f) - 1$
leaves and computes $f$, which is impossible since $L(f)$ is the complexity of $f$.

Let us denote by $t_e$ the subtree rooted at $v$, the lowest common ancestor of
the wildcards. We have shown that a typical tree computing $f$ is a minimal tree
of $f$ into which we have plugged an expansion tree $t_e$ which does not change the
function $f$. □

Lemma 13. Let $t$ be a typical tree computing $f$. The expansion tree $t_e$ is either
a simple tautology (or simple contradiction), or an X-expansion, i.e. a tree with
one $\land$-leaf (resp. $\lor$-leaf) labelled by an essential variable of $t$.

Proof. As shown in the previous lemma, a typical tree computing $f$ is a minimal
tree of $f$ into which an expansion tree $t_e$ has been plugged.

First Case: Let us assume that $t_e$ has no $(N \oplus P)$-repetition and no essential
variable among its $(N \oplus P)$-pattern leaves. Then, we can replace $t_e$ by a wildcard
and simplify this wildcard. This simplification suppresses at least one other leaf
of the tree: the obtained tree is then smaller than the original minimal tree, and
still computes $f$. This is impossible.

Second Case: Let us assume that $t_e$ has at least two $(N \oplus P)^2$-restrictions.
Thanks to Lemma [10], this family of expanded trees is negligible.

Third Case: Let us assume that $t_e$ has exactly one $(N \oplus P)^2$-restriction. Then
it must be an $N \oplus P$-restriction (cf. First Case).

- If it is a repetition, then one can show that $t_e$ must be a simple tautology or
a simple contradiction.

- If it is an essential variable, one can show that $t_e$ must be an X-expansion.

□ 



